5 research outputs found

    Automated Anomaly Detection in Virtualized Services Using Deep Packet Inspection

    Virtualization technologies have proven to be important drivers for the fast and cost-efficient development and deployment of services. While the benefits are tremendous, there are many challenges to be faced when developing or porting services to virtualized infrastructure. Critical applications in particular, such as Virtualized Network Functions, must meet high requirements in terms of reliability and resilience. An important tool for meeting such requirements is detecting anomalous system components and recovering from the anomaly before it turns into a fault and subsequently into a failure visible to the client. Anomaly detection for virtualized services relies on collecting system metrics that represent the normal operation state of every component and allow machine learning algorithms to automatically build models representing that state. This paper presents an approach for collecting service-layer metrics while treating services as black boxes. This allows service providers to implement anomaly detection on the application layer without the need to modify third-party software. Deep Packet Inspection is used to analyse the traffic of virtual machines on the hypervisor layer, producing both generic and protocol-specific communication metrics. An evaluation shows that the resulting metrics represent the normal operation state of an example Virtualized Network Function and are therefore a valuable contribution to automatic anomaly detection in virtualized services.

    A System Architecture for Real-time Anomaly Detection in Large-scale NFV Systems

    Virtualization as a key IT technology has developed into a predominant model in data centers in recent years. The flexibility regarding scaling out and migration of virtual machines for seamless maintenance has enabled a new level of continuous operation and changed service provisioning significantly. Meanwhile, services from domains striving for the highest possible availability (for example, the telecommunications domain) are adopting this approach as well and are investing significant effort into the development of Network Function Virtualization (NFV). However, the availability requirements for such infrastructures are much higher than is typical for IT services built upon standard software and off-the-shelf hardware. They require sophisticated methods and mechanisms for fast detection of and recovery from failures. This paper presents a set of methods and an implemented prototype for anomaly detection in cloud-based infrastructures, with a specific focus on the deployment of virtualized network functions. The framework is built upon OpenStack, the current de facto standard for open-source cloud software, and aims at increasing the level of availability and fault tolerance by providing an extensive monitoring and analysis pipeline able to detect failures or degraded performance in real time. The indicators for anomalies are created using supervised and unsupervised classification methods, and preliminary experimental measurements showed a high percentage of correctly identified anomaly situations. After a successful failure detection, a set of pre-defined countermeasures is activated in order to mask or repair outages or situations with degraded performance.

    Autonome Selbstheilung in Cloud-Computing-Plattformen

    The demand for increasingly rich services with high-level abstractions drives the field of cloud computing. In both research and industry, additional layers and components continuously increase the complexity of these modern platforms. The size of cloud systems has long surpassed what human administrators are able to manage. Nonetheless, users and customers expect high availability and reliability from both the applications and the underlying platform, which is only possible through automation. Today, most automated dependability techniques focus on increasing the availability of distributed systems by preventing or masking component outages. However, both software and hardware components often exhibit behavior where the delivered service degrades without becoming entirely unavailable. Such anomalies, also called gray failures or degraded states, originate from software bugs or other unforeseen issues with the system. Some application-specific systems attempt to handle certain types of anomalies by applying a pre-defined set of rules. In general, however, administrators have to resolve anomaly situations manually. In practice, there is no technique or system for detecting and resolving anomaly situations in a generic way, that is, without making assumptions about the monitored application. Accordingly, this thesis suggests an extension to traditional cloud infrastructures that provides self-healing functionality. Our approach monitors live data streams collected from all critical system components and analyzes the collected data for anomalous behavior. Once an anomaly is detected, the system further investigates the situation, determines the root cause, and automatically implements a remediation plan to resolve the problem. We analyze the requirements to build such a self-healing system and present an abstract system architecture that fulfills the given requirements.
The proposed self-healing cloud provides administrators with a coherent set of configuration values that determine the level to which remediation workflows are executed automatically. Further, we define the concept of in situ data analysis and design a data analysis engine that executes the necessary processing tasks alongside the cloud workload without disrupting it. Finally, we apply the abstract architecture to the scenario of a public cloud platform and present a prototypical implementation of the named concepts. We evaluate various properties of our implementation in a practical experimental testbed and complement this evaluation with a qualitative analysis.

    Orca: A Single-Language Web Framework for Collaborative Development

    In the last few years, the Web has been established as a platform for interactive applications. However, creating Web applications involves numerous challenges, since the Web was created to serve static content. In particular, the separation of the client and server sides, connected only through the unidirectional Hypertext Transfer Protocol, forces developers to use two programming languages with different libraries, conventions, and tools. Developers build expert knowledge by specializing in a few of the technologies involved. Consequently, the diverse knowledge of team members makes collaboration in Web development laborious. We present the Orca framework, which allows developers to work collaboratively on client-server applications in a single object-oriented programming language. Based on the Smalltalk programming language, full access to existing libraries, and a bidirectional messaging abstraction, Orca provides a consistent environment that supports common idioms and patterns in client- and server-side code. It reduces the need for expert knowledge and the number of development tools and thus facilitates the collaboration of Web developers.